A Text Retrieval Package for the Unix Operating System
Abstract
This paper describes lq-text, an inverted index text retrieval package written by the author. Inverted index text retrieval provides a fast and effective way of searching large amounts of text. This is implemented by making an index to all of the natural-language words that occur in the text. The actual text remains unaltered in place, or, if desired, can be compressed or archived; the index allows rapid searching even if the data files have been altogether removed. The design and implementation of lq-text are discussed, and performance measurements are given for comparison with other text searching programs such as grep and agrep. The functionality provided is compared briefly with other packages such as glimpse and zbrowser. The lq-text package is available in source form, has been successfully integrated into a number of other systems and products, and is in use at over 100 sites.
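The idea of indexing every natural-language word so that searches need not rescan the text can be illustrated with a minimal sketch. This is not lq-text's implementation or on-disk format; the function names `build_index` and `search`, the sample file names, and the simple letters-only tokenizer are assumptions made purely for illustration.

```python
import re
from collections import defaultdict


def build_index(documents):
    """Map each lowercased word to the set of (document name, word position) pairs."""
    index = defaultdict(set)
    for name, text in documents.items():
        # Hypothetical tokenizer: a word is any run of ASCII letters.
        for position, match in enumerate(re.finditer(r"[A-Za-z]+", text)):
            index[match.group().lower()].add((name, position))
    return index


def search(index, word):
    """Return sorted names of documents containing the word, consulting only the index."""
    return sorted({name for name, _ in index.get(word.lower(), ())})


# Hypothetical sample data; the original text is not needed once the index exists.
documents = {
    "a.txt": "The index allows rapid searching",
    "b.txt": "grep scans every file; an index does not",
}
index = build_index(documents)
print(search(index, "index"))  # ['a.txt', 'b.txt']
print(search(index, "grep"))   # ['b.txt']
```

A lookup touches only the entry for the queried word, whereas a scanning tool such as grep must read every file on every query. That difference is the trade-off the abstract alludes to: extra work and storage at indexing time in exchange for fast searches afterwards.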
Similar resources
Image retrieval using the combination of text-based and content-based algorithms
Image retrieval is an important research field which has received great attention in the last decades. In this paper, we present an approach for the image retrieval based on the combination of text-based and content-based features. For text-based features, keywords and for content-based features, color and texture features have been used. Query in this system contains some keywords and an input...
Free Text Information Retrieval: An Assessment of Publicly Available Unix-Based Systems
Arising out of interest in approaches for content-based image retrieval from the HST image archive, we prototyped free text retrieval systems using accepted proposal abstracts (hence: what the later associated images were meant to deal with). Other possible applications of such free text retrieval systems in observatory operations are described. We used widely available, publicly accessible, (Sun) U...
Semiautomatic Image Retrieval Using the High Level Semantic Labels
Content-based image retrieval and text-based image retrieval are two fundamental approaches in the field of image retrieval. The challenges related to each of these approaches, guide the researchers to use combining approaches and semi-automatic retrieval using the user interaction in the retrieval cycle. Hence, in this paper, an image retrieval system is introduced that provided two kind of qu...
User-Level Threads and Interprocess Communication
User-level threads have performance and flexibility advantages over both Unix-like processes and kernel threads. However, the performance of user-level threads may suffer in multiprogrammed environments, or when threads block in the kernel (e.g., for I/O). These problems can be particularly severe in tasks that communicate frequently using IPC (e.g., multithreaded servers), due to interactions bet...
OS Support for VLDBs: Unix Enhancements for the Teradata Data Base
This paper presents the parallel enhancements which allowed the port of the Teradata Database from TOS, a proprietary 16-bit Operating System, to an SVR4 Unix system. It gives an architectural overview of how the Teradata Database solves the main VLDB problems: performance and reliability. We will present the transition from the Database Computer DBC/1012 nodes (Interface Processors-IFPs and Acce...